Rank | Count | Beginning |
---|---|---|
6360 | 441 | यस |
7448 | 422 | यो |
8869 | 258 | सन् |
1563 | 160 | उनले |
1726 | 157 | उनी |
6504 | 114 | यसको |
1434 | 106 | उनको |
3619 | 92 | तर |
4610 | 82 | नेपालको |
7888 | 64 | र |
7366 | 56 | यी |
2059 | 55 | एक |
1292 | 54 | उक्त |
3851 | 47 | त्यस |
3891 | 39 | त्यसपछि |
7011 | 39 | यसले |
7238 | 39 | यहाँ |
6447 | 38 | यसका |
6888 | 36 | यसमा |
6975 | 35 | यसलाई |
2551 | 34 | केही |
4812 | 34 | पछि |
6935 | 32 | यसरी |
9858 | 30 | हाल |
1128 | 28 | आफ्नो |
4010 | 28 | त्यसैले |
4107 | 28 | त्यो |
9331 | 28 | साथै |
4703 | 27 | नेपालमा |
1400 | 26 | उनका |
The next four subsections show the most frequent sentence beginnings consisting of N words, for N = 1, 2, 3, 4. This subsection starts with N = 1.
The most frequent word-N-grams at the beginning of sentences give some insight into sentence composition.
For N = 1 in particular, even a small corpus is enough to identify the most frequent sentence beginnings.
select substring_index(sentence, ' ', 1) as beg, count(*) as cnt from sentences group by substring_index(sentence, ' ', 1) order by cnt desc limit 50;
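The same counting can be sketched outside the database. The following is a minimal Python illustration, not the tooling used to build the table above: the function name `sentence_beginnings`, its parameters, and the three sample sentences are assumptions made up for this example. It mirrors `substring_index(sentence, ' ', N)` by keeping everything before the N-th space, so the same function covers N = 1, 2, 3, 4.

```python
from collections import Counter


def sentence_beginnings(sentences, n=1, limit=50):
    """Return the `limit` most frequent beginnings made of the first n words.

    Hypothetical helper for illustration; it mirrors the SQL expression
    substring_index(sentence, ' ', n) followed by GROUP BY / ORDER BY cnt DESC.
    """
    counts = Counter()
    for sentence in sentences:
        # Splitting on a single space and re-joining the first n pieces keeps
        # exactly the text before the n-th space (or the whole sentence if it
        # contains fewer than n spaces), as substring_index does.
        beginning = " ".join(sentence.split(" ")[:n])
        counts[beginning] += 1
    return counts.most_common(limit)


# Made-up sample sentences, only to exercise the function.
sample = [
    "यो एक उदाहरण हो ।",
    "यो अर्को वाक्य हो ।",
    "उनले किताब पढे ।",
]

top = sentence_beginnings(sample, n=1)
for beginning, count in top:
    print(count, beginning)
```

Calling `sentence_beginnings(sample, n=2)` yields the two-word beginnings in the same way, which corresponds to changing the last argument of `substring_index` in the query.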